Conversation

@adithya-s-k
Contributor

Summary

This PR adds support for BiGemma3 and ColGemma3 models based on the Gemma3-4B-IT backbone, enabling multilingual multimodal document retrieval.

Key Features

  • BiGemma3: Single-vector dense retrieval model with Matryoshka representation learning
    • Supports flexible embedding dimensions (768, 1536, 2560) at inference time (see the sketch after this list)
    • Efficient document retrieval with configurable accuracy/speed trade-offs
  • ColGemma3: Multi-vector late interaction model using ColBERT-style architecture
    • Fine-grained token-level matching with MaxSim scoring
    • 128-dimensional per-token embeddings
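
As a rough illustration of how the Matryoshka dimensions relate (a conceptual sketch only, not the PR's actual implementation): a smaller embedding is obtained by keeping the leading components of the full 2560-dimensional vector and re-normalizing.

import torch

def truncate_matryoshka(full_emb: torch.Tensor, dim: int) -> torch.Tensor:
    """full_emb: (batch, 2560) full-size embeddings; dim is one of 768, 1536, 2560."""
    # Keep the first `dim` components, then L2-normalize so cosine scores stay comparable.
    return torch.nn.functional.normalize(full_emb[..., :dim], dim=-1)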

Changes

  • Added BiGemma3 model in colpali_engine/models/gemma3/bigemma3/

    • modeling_bigemma.py: Model implementation with Matryoshka support
    • processing_bigemma.py: Processor for images and text
  • Added ColGemma3 model in colpali_engine/models/gemma3/colgemma3/

    • modeling_colgemma.py: Multi-vector model implementation
    • processing_colgemma.py: Processor with MaxSim scoring
  • Added comprehensive tests in tests/models/gemma3/

    • Unit tests for model loading
    • Integration tests for forward pass
    • Retrieval tests with visual documents

API Design

BiGemma3 allows choosing the embedding dimension at inference time:

import torch

from colpali_engine.models import BiGemma3, BiGemmaProcessor3

model = BiGemma3.from_pretrained(
    "Cognitive-Lab/NetraEmbed",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

# `inputs` comes from the processor (batched images or queries);
# choose the embedding dimension at inference time
embeddings = model(**inputs, embedding_dim=1536)
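
A minimal end-to-end scoring sketch building on the snippet above. It assumes BiGemmaProcessor3 follows the usual colpali_engine processor interface (process_images / process_queries) and uses a placeholder image and query; the exact interface introduced in this PR may differ. Single-vector retrieval then reduces to cosine similarity between the truncated embeddings:

from PIL import Image

processor = BiGemmaProcessor3.from_pretrained("Cognitive-Lab/NetraEmbed")

images = [Image.new("RGB", (448, 448), "white")]   # placeholder document page
queries = ["What is the quarterly revenue?"]       # placeholder query

with torch.no_grad():
    doc_inputs = processor.process_images(images).to(model.device)
    query_inputs = processor.process_queries(queries).to(model.device)
    doc_emb = model(**doc_inputs, embedding_dim=1536)      # (num_docs, 1536)
    query_emb = model(**query_inputs, embedding_dim=1536)  # (num_queries, 1536)

# Cosine similarity between L2-normalized single-vector embeddings
scores = torch.nn.functional.normalize(query_emb.float(), dim=-1) @ \
    torch.nn.functional.normalize(doc_emb.float(), dim=-1).T  # (num_queries, num_docs)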

ColGemma3 uses standard multi-vector late interaction:

import torch

from colpali_engine.models import ColGemma3, ColGemmaProcessor3

model = ColGemma3.from_pretrained(
    "Cognitive-Lab/ColNetraEmbed",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

# `inputs` comes from the processor (batched images or queries)
embeddings = model(**inputs)  # (batch, num_patches, 128)
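
The MaxSim late-interaction score can also be written directly in torch. This sketch is independent of the ColGemmaProcessor3 scoring utilities mentioned above and ignores padding masks for brevity:

import torch

def maxsim_scores(q_emb: torch.Tensor, d_emb: torch.Tensor) -> torch.Tensor:
    """q_emb: (num_queries, q_len, 128); d_emb: (num_docs, d_len, 128)."""
    # Token-level dot products: (num_queries, num_docs, q_len, d_len)
    sim = torch.einsum("qnd,pmd->qpnm", q_emb.float(), d_emb.float())
    # Each query token keeps its best-matching document patch, then sum over query tokens.
    return sim.max(dim=-1).values.sum(dim=-1)  # (num_queries, num_docs)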

Models

Related Work

adithya-s-k and others added 8 commits December 3, 2025 23:58
- Introduced BiGemma3 and BiGemmaProcessor3 for image and text processing.
- Added ColGemma3 and ColGemmaProcessor3 for late interaction retrieval.
- Implemented model and processor classes with appropriate forward methods.
- Created unit tests for BiGemma3 and ColGemma3 models and their processors.
- Ensured compatibility with existing Gemma3 architecture and added necessary processing utilities.
…mension validation and improved processor loading
- Implemented offline and online testing for BiGemma3 using Matryoshka embeddings.
- Created synthetic images and queries for testing across multiple dimensions (768, 1536, 2560).
- Validated image and query encoding, similarity scoring, and retrieval performance.
- Configured Modal app with necessary dependencies and environment settings.
- Added comprehensive logging and validation checks for test results.
- Implemented `serve_hf_snapshot.py` for HuggingFace model serving with optimized cold start and warmup.
- Introduced `serve_vllm_snapshot.py` for vLLM model serving with sleep mode and GPU memory snapshots.
- Added comprehensive benchmark report for inference performance in `INFERNECE_PERFORMANCE.md`.
- Both scripts support FastAPI endpoints for embedding generation and health checks.
- Configured deployment settings including GPU type, memory, and scaledown behavior.
…arameter and adjust forward method for validation
@ManuelFay
Collaborator

Will look at this tomorrow! Thanks!

@athrael-soju
Contributor

@adithya-s-k could you add some interpretability maps from your tests? Just checking how they differ from colmodernvbert and colqwen3

@adithya-s-k
Contributor Author

@athrael-soju, I have pushed the code for the interpretability maps, do check it out.

@ManuelFay
Collaborator

LGTM, but can you please run ruff on the code and fix the tests so the CI passes?

@ManuelFay
Collaborator

For ruff it's probably not even on you. The CI was disabled last week due to the Shai-Hulud bug, so the ruff checks didn't pass on one of the merged PRs.
I'll merge right after!

@ManuelFay
Collaborator

The Gemma tests fail because of the model gating. Do you have a base model (like we do for all the supported architectures) that is initialized with the final projection? Otherwise the init will be random every time if we start from Gemma (+ the gating problem).

@adithya-s-k
Contributor Author

Hey, I have just pushed two models.
These are not gated and should be easy to test: they are just the base Gemma model and, for the col model, the base model + the projection layers.

The final checkpoints can also be used.

@ManuelFay
Collaborator

Yeah, looks good! Can you update the PR to include them so we run the CI again?
You can also add the results of the final model in the README (along with the link if you want).

@adithya-s-k
Contributor Author

Hi @ManuelFay, I have made all the requested changes.
Updated the PR to include the ungated base models for both BiGemma3 and ColGemma3, fixed the model references in tests, and added the interpretability tests.

I have locally verified that the tests now run correctly with these models and everything looks good on my side.

@ManuelFay
Collaborator

Awesome, I'll merge my linting PR and then merge yours! Thanks a ton!

@ManuelFay
Collaborator

Can you rebase on main (it will make the ruff CI happy)? Then we will merge!

@ManuelFay
Collaborator

Only 1 ruff error remaining. Then we can merge, the rest looks nice!
Thanks again!

@adithya-s-k
Contributor Author

@ManuelFay I have fixed the ruff issue; I think everything should be set to merge the PR.

@ManuelFay
Collaborator

We should document more clearly how to run ruff, but the CI is still failing.

You need to run `ruff format --check`.

I approved and will merge right after

@adithya-s-k
Contributor Author

@ManuelFay I have run it locally and tested; it should pass all the checks now.

@ManuelFay merged commit 8b6700f into illuin-tech:main on Dec 30, 2025
6 checks passed
@ManuelFay
Collaborator

Thank you for the contributions! Don't hesitate to submit your model results on the MTEB visual retrieval leaderboard!
